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Abstract 

We develop a technique for generating a set of optimized local basis functions to solve 
models in the Kohn-Sham density functional theory for both insulating and metallic systems. 
The optimized local basis functions are obtained by solving a minimization problem in an 
admissible set determined by a large number of primitive basis functions. Using the optimized 
local basis set, the electron energy and the atomic force can be calculated accurately with 
a small number of basis functions. The Pulay force is systematically controlled and is not 
required to be calculated, which makes the optimized local basis set an ideal tool for ab initio 
molecular dynamics and structure optimization. We also propose a preconditioned Newton- 
GMRES method to obtain the optimized local basis functions in practice. The optimized 
local basis set is able to achieve high accuracy with a small number of basis functions per 
atom when applied to a one dimensional model problem. 
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1. Introduction 

In scientific computation of systems with large number of degrees of freedom, an efficient 
choice of basis functions becomes desirable in order to reduce the computational cost. In 
this paper, we focus on the choice of efficient basis sets for the Kohn-Sham density func- 
tional theory (KSDFT) Q,^, which is the most widely used electronic structure theory for 
condensed matter systems. The methods and concepts illustrated here are also useful for 
other applications. 
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In KSDFT, the quantities of interest are the electron energy E{R) and the atomic force 
F{R). Here we denote hj R = (i?i,i?27 ■ ■ ■ ,Rna)^ the atomic positions, where A^^ is the 
number of atoms. The atomic force is expressed in terms of the derivatives of the electron 
energy with respect to the atomic positions as F{R) = This is an important quantity 

in many applications including structure optimization and first principle molecular dynamics. 
The electron energy is a functional of a set of Kohn-Sham orbitals {ipi}fLi where is the 
number of electrons in the system. To illustrate the idea with minimal technicality, let us 
consider for the moment a system of non-interacting electrons at zero temperature. The 
energy functional for non-interacting electrons takes the form 

E{{Ux)}l,;R) = -J2 dx+ / V{x-R)J2\M^)f dx. (1) 

^ i=i i=i 

The first term and the second term in ([T]) are the kinetic energy and the potential energy of 
the system, respectively. The energy E{R) as a function of atomic positions is given by the 
following minimization problem 

E{R)= min E {{^Pi{x)}l,; R) , 

r (2) 
s.t. '4)*{x)i)j{x)dx = Sij, i,j = l,...,N. 



We denote by {ipiix; R)}fL^ the minimizer. It can be readily shown that {ipi{x; R)}^i are 
the lowest eigenfunctions of the Hamiltonian operator H{R), which takes the form 

H{R) = -^A^ + V{x;R). (3) 

Using the Hamiltonian operator, the electron energy has an alternative expression without 
the explicit dependence on the orbitals {ipi}f^i: 

E{R) = Tr [H{R)x{H{R) - fi{R))] ^ Tr[go{H{R))], (4) 

where x(') is the Heaviside function: x(^) = 1 if x < and is otherwise. Here fi{R) is the 
chemical potential, which takes value between the A^-th and (A^ + l)-th eigenvalues of H to 
control the number of electrons. 

Since all the quantities depend on the atomic positions R, to simplify the notation we drop 
the dependence of R unless otherwise specified. If we approximate the eigenfunctions {ipi}iLi 
by linear combination of a set of basis functions $ = (0i, ■ ■ ■ , (pNi,), the Hamiltonian operator 
H is discretized into a finite dimensional matrix ^^H^ (here and in the following, we will 
use the linear algebra notation: = {(j)i\H\(j)j)). The number of basis functions A'^;, is 

therefore called the discretization cost. The electron energy and the force can be expressed 



2 



in terms of the discretized Hamiltonian operator as 



= Tr [go{<^^H<i> 
dE^ 



dRi 
Tr 



2Tr 



is the J-th component of the force. In what follows the second equation in 
written in a compact form as 



(5) 



is also 



dE^ 
'~dR 



Tr 



2Tr 



(6) 



Choosing basis functions $ adaptively with respect to the atomic positions R has obvious 
computational advantages, as it allows the possibility to reduce the discretization cost by a 
significant amount while maintaining the accuracy for the evaluation of the electron energy 
and atomic forces. Since the electron energy is defined variationally as in ([2]), an accurate ba- 
sis set should minimize the electron energy. However, choosing the basis functions adaptively 
gives arise to some difficulties in the evaluation of the force (j5]) which requires the calculation 
of 1^. In electronic structure theory, the contribution from |^ is referred to as the Pulay 
force [ij. We will henceforth adopt this terminology. The Pulay force originates from the 
incompleteness of the basis set, and has been found to be important to obtain the force 
with reliable accuracy for structure optimization or first principle molecular dynamics js], 01 • 
The calculation of the Pulay force can be quite expensive even if the basis functions $ have 
analytical expressions, and the calculation of the Pulay force becomes almost intractable if 
the basis functions are defined implicitly such as in the adaptive mesh method 
We would like to systematically reduce the Pulay force so that the approximation 



dE^ 
OR 



Tr 



(7) 



becomes adequate. 

The key observation in this paper is that minimizing the electron energy and reducing 
the Pulay force can be simultaneously achieved by the following optimization procedure 



min E. 

<S>cV,'i>T<S>=l 



$ = min Tr Igoi^^H^ 
<i>cv,<i>T$=/ '- 



Here V is an admissible subset of the space spanned by a set of primitive basis functions 
which are independent of R. Later V will be referred to as the admissible set. We select from 
V a small number of i?-dependent optimized basis functions $ = ■ ■ ■ ,4'Nt) which give 
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rise to the lowest electron energy in V. The Euler-Lagrange equation for the minimization 
problem ([H]) reads 



(9) 



where the matrix A is a Lagrangian multiplier and is symmetric. When the first optimality 
condition (l9l) is satisfied, we find 



2Tr 



2Tr 



Tr 



A 



OR 



0. 



(10) 



The last equality comes from the orthonormal constraint on the optimized basis functions 
The reason why (ITOl) holds can be understood from the variational structure of the original 
problem ([8]), which is related to the Hellmann-Feynman theorem in quantum mechanics. As 
a result, the Pulay force vanishes in the atomic force even if the optimized basis functions 
are far from being a complete basis set. 

The choice of the primitive basis functions is crucial. Although the optimized basis 
functions are always incomplete due to the small number of basis functions used, the primitive 
basis set should be systematically improvable towards a complete basis set. Each primitive 
basis function should be local in order to be suitable for large scale parallel calculation. In 
our previous work [ij, the primitive basis set is constructed using a discontinuous Galerkin 
(DG) framework. The DG primitive basis set allows the usage of basis functions that are 
discontinuous across element surfaces. Each DG primitive basis function is local in the real 
space, and thus gives full flexibility in the choice of the optimized basis functions. The 
locality constraint in the real space can therefore be naturally applied to the optimized basis 
functions, giving rise to the optimized local basis set. 

We remark that a large primitive basis set also presents practical difficulties for the 
optimization procedure. In this paper we propose a preconditioned Newton-GMRES method 
to obtain the optimized local basis functions. Numerical results using a one dimensional 
model problem validate the performance of the optimized local basis functions: the electron 
energy and the force can be accurately calculated along the trajectory of the molecular 
dynamics without systematic drift, using a very small number of basis functions per atom. 

Improving the quality of the basis functions via variational optimization has been previ- 
ously studied in the electronic structure theory. However, to the best of our knowledge all 
the optimized basis functions presented so far use atom-centered primitive basis functions, 
such as atomic orbitals or Gaussian-type orbitals. Since atomic orbitals or Gaussian-type 
orbitals depend on the atomic positions and do not form a complete basis set, the Pulay 
force never vanishes. The Pulay force of all the primitive basis functions should be com- 
puted for each atomic configuration. Moreover, optimization for each atomic configuration 
is generally considered to be an expensive procedure, and the optimized basis functions are 
usually obtained for specific reference systems instead. For example, Junquera et al 10|] pro- 



posed to optimize the shape and cutoff radii of a set of numerical atomic orbitals; Ozaki [11 



proposed using the optimal linear combination of a set of numerical atomic orbitals; Blum 



et al [12| used a greedy method to select basis functions from a large pool of numerical 
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atomic orbitals. The drawback of this procedure is that the quahty of the basis functions 
depends heavily on the choice of the reference system. The transferabihty of these basis sets 
obtained for specific reference systems should be tested carefully for a variety of systems. 
Optimized basis functions without the choice of reference systems have also been studied 



before. Talman 13| proposed to optimize a set of numerical atomic orbitals for all the atoms 



simultaneously. Rayson and Briddon tried to find the optimal linear combination of 
Gaussian-type orbitals, where the optimization process loops over each atom in the system. 
These methods share similar spirit as the present work, and can be regarded as approximate 
strategies towards achieving optimality in practice. 

Our current work avoids the subtle issue of transferability by means of an optimization 
procedure for any given system, which could be advantageous for complex systems where 
manually constructed transferable basis functions are difficult to be obtained. The DG 
primitive basis set is a complete basis set, and the optimized local basis functions are local 
by construction. The DG primitive basis set is independent of the atomic positions, and the 
Pulay force vanishes when the optimality condition is reached. 

The rest of the paper is organized as follows. In Section |2l we introduce the optimized 
local basis set for KSDFT. Numerical examples are presented in Section [21 followed by 
discussion and conclusion in Section HJ To make the paper self-contained, we briefly recall 



the finite temperature Kohn-Sham density functional theory in Appendix A 



2. Optimized local basis function 

As introduced in our previous work |9|, using a discontinuous Galerkin method (the 



interior penalty method 15|, ll6|), the effective energy functional in Kohn-Sham density 



functional theory is given by 

i i 

+ « E ( [M ' [M >5 + E K^^' ^^)ri' (11) 

i i i 

+ Ml -/.))• 

i 

This is a discretization method for the Helmholtz free energy flA.181) for a system at tem- 



perature see Appendix A for details of formulation of Kohn-Sham density functional 
theory in finite temperature. Here T is a collection of quasi-uniform rectangular partitions 
of the computational domain: 

r= {^i,^2,--- ,^m}, (12) 

and S be the collection of surfaces that correspond to T. (■, and (■, ■)g are inner products 
in the bulk and on the surface respectively. The notations {{ ■ }} and [[ ■ ]] are used for the 
standard average and jump operators across surfaces in the interior penalty method. We 
refer to [9l for more details. 
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Let $ be a chosen set of basis functions $ = {^k,j}jti^ where each (p^j is supported in 
Ek and Jk is the total number of basis functions in E^.. The corresponding approximation 
space V$ is given by 

Vd, = span{v9fcj, Ek e T, j = 1, - ■ ■ , Jk}- (13) 
The approximated Kohn-Sham orbitals are the solutions to the minimization problem 

{i/'i}cv<i>,{/i} 

f ^ (14) 

s.t. ij*ipjdx = 5ij, z, j = 1, ■ ■ ■ , A^, 



where is chosen to be slightly larger than the number of electrons in the system in 



order to compensate for the finite temperature effect (see Appendix A for more detailed 
explanation). We propose the optimized local basis functions which give rise to a specific 
choice of $, in order to achieve accuracy for both the Helmholtz free energy and the force 
while using a small number of basis functions. Following the spirit of (IE]) introduced for 
the model problem in the introduction, the optimized local basis function set $ solves the 
following minimization problem 

min min J^Dciitpi}, {fi})^ (15) 

<i>cv,i>T^=i{i,,}cv^,{f,} 

where V is the admissible set. To define the admissible set, we take for each element E^ a 
set of basis functions {ukj,j = 1, ■ ■ ■ , Jk}- Each Ukj is compactly supported in E^, and they 
satisfy the orthonormality condition 

{uk'j',Ukj)j- = Skk'Sjj'- (16) 

For example, {ukj} can be polynomials restricted to the set Ek up to a certain order. Other 
forms of primitive basis functions can be chosen as well, without changing the discussion 
that follows. The discretized Hamiltonian in the DG formulation takes the form 

Hk'j'Aj =\ (V?/fc'j', Vufc,,)^ - ^ ( , {{ Vufcj}})^ 

- \ ({{ V«fc',,'}}, >5 + « ( [hfc',,']] , [h,,]] >5 (17) 

The optimized local basis functions should be local in the real space in order to facilitate 
large scale computation. Since {u^j} are compactly supported in E^, the locality constraint 
on the optimized local basis functions is naturally imposed by requiring each function in the 
admissible set to be linear combinations of {ttfcj} for the same /c, i.e. 

M 



V = y span{ufc,j, j = 1, ■ ■ ■ , Jfc}, 



k=l 
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where M is the number of elements. 

Inside each element E^, we select optimized local basis functions from the admis- 
sible set. N}^ is much smaller than J^. The optimized local basis functions are denoted 
by {0fc,i, ■ ■ ■ ,0A;,Affe}, and are represented by the linear combination of the primitive basis 
functions 

Jk 

4>k,i = ^ 4>k,i,jUk,j, I = !,■■■ ,Nk. 
j=i 

With slight abuse of notation, we use (j)k,i also for the column vector of the coefficients in 
the primitive basis functions: 

T 

(19) 



'Pk,i = y(Pk,i,i (pk,i,2 ■ ■ ■ (Pk,i,jK 

If we write 

= 4>k,2 ■ ■ ■ 4>k,Nk) , (20) 

the optimized local basis set $ represented in the primitive basis set takes the form 

$ = diag(<l>i,$2,--- ,$A/). (21) 

Because of the block diagonal structure, the orthonormality constraint = / is equivalent 
to the orthonormal constraint for each ^k, i-G- , ^k^k = Ik, k = 1, ■ ■ ■ , M. Here each block 
$j is a rectangular matrix of size A^^ x N^, where A^^^ is the number of grid points in the 
element, and is the number of basis functions. 1^ is an x A^^ identity matrix. 

Under the basis set $, the discretized Hamiltonian becomes ^^H^ with H given by ( |T71) . 
The Helmholtz free energy can be written without the explicit dependence on {tpi} and {/i}: 

min J-dg(M},{/.}) =Tr(7($Tif<|.)+/.iV, (22) 

{i>i}cv.s>,{fi} 

where the function g, which is a finite temperature version of go, is defined as 

^(x) = -/3-Mn(l +exp(/3(/i-x))). (23) 
Note that the derivative of g is the Fermi-Dirac function 

(7'(x) = (l + exp(/3(x-/i)))-\ (24) 
Hence, the minimization problem (fT5|) becomes 

J'dg = min [Tr ^($^i/$) + fiN] , 

s.t. $J$fc = 4, k = l,---,M. 
The atomic force is then given by 

dR 

= - Tr(p<,<|.^||$) - 2 Tr(p^$^i7||) (26) 
^-Tr(p..^f.), 



(25) 
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where pq, = g'l^'^H^) is the single particle density matrix associated to the discretized 
Hamiltonian ^^H^. pij, can be evaluated using standard diagonalization techniques by 
computing the eigenvalues and eigenvectors of the reduced Hamiltonian This is 

asymptotically the most time consuming step which scales as 0{N^) where N is the number 
of electrons in the system. For the ID system considered in this manuscript, is solved by 
the MATLAB diagonalization subroutine eig. For systems of large size, the diagonalization 
routine can be replaced by the recently developed low order scaling selected inversion meth- 



ods [17|, Il8[ to reduce the computational cost. The Pulay force vanishes in the last equality 
when the first order optimality of the optimization problem ( 125|) is reached, following the 
same reasoning as in (fTOl) . 

The Euler-Lagrange equation with respect to the minimization problem (125!) reads 

'i7$p$-$A = 

$T$-/ = ' ^ ' 

where the A is a block diagonal matrix 

A = diag(Ai, A2, ■ ■ ■ , Am), 

which is the Lagrange multiplier for the orthonormal constraints. Due to the block diagonal 
structure of $, we can write the first order optimality condition (1271) as 

J2 Hij^jP^ji - <^'^A^ = 0, « = 1, ■ ■ • , M. (28) 

i 

Define the remainder for the i-th element as 

ii.(*.A)=(S.««*^^-*A). (29) 

We solve i?i($, A) = for i = 1, 2, ■ ■ ■ , M. 

In order to solve the nonlinear system f l27p . we propose a preconditioned Newton-GMRES 
method as follows. Denote by J the Jacobian matrix. At the /-th iteration, the Newton step 
solves the following linear system for the correction term 

To make the optimization feasible in practice, we take the following approximation. We 
neglect the derivative of p$ = g'{<^^H(^) with respect to $ in the Jacobian. The most 
important reason for this approximation is that the numerical evaluation of such derivative is 
quite expensive. In practice we find that the residue of the Euler-Lagrange equation decays 
fast in the first few Newton iterations, and slows down when the residue becomes small, 
suggesting that the derivative of p$ with respect to the basis functions can be important 
especially for the small residue case. Numerical results indicate that the accuracy of the 
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Helmholtz free energy and the force can already be improved by one order of magnitude 
after a few Newton iterations. Further improvement that includes the approximate form of 
the derivative of p$ will be considered in the future work. Using this approximation, the 
correction equation fl30|) can be written explicitly as 

E, i/.,(A$),p^,,, - (A$),A, - $.(AA)A 



for 2 = 1,2, ■ ■ ■ ,M. 

We solve the linear system (131 p using a preconditioned GMRES method. The GMRES 



method [19| is a robust way for solving ill-conditioned linear equations. The preconditioner 



should give an approximate solution efficiently for the following equation 
fZj i/.,(A$),p^,,, - (A$),A, - <I>.(AA)A _ _ fB,\ 

{ -$j(A$), - (A$)^$, )- [aj ^''^ 

for any right hand side {-Bj}, {C*i}- To this end we first neglect the interaction between 
different elements: 



iJ,,(A$),p$,, - (A$)iA, - $,(AA)A _ _ [B, 
-<|.T(A$), - (A$)T$, ) [a 



(33) 



The equations of (A$)i for different elements become decoupled. (!33l) can be therefore solved 
independently in each element. Second, we note that there are degeneracy issues solving (133|) . 
This is because in the subspace spanned by the basis V<s>, only the low- lying eigenfunctions 
of the discrete Hamiltonian affect the free energy much, while the eigenfunctions with large 
eigenvalues do not contribute much due to small occupation number. Therefore, if we change 
the subspace V$ in the direction of these high energy eigenfunctions, it does not change much 
the energy, which causes degeneracy. 

We propose the following pruning method to solve the degeneracy problem. Instead of 
solving fl5^ . we restrict to the basis functions contributed to the low-lying eigenfunctions by 
the following procedure. Given density matrix p$, for each element Ei, we take a singular 
value decomposition of the diagonal block of 

P<!>,n = UiS.Uj, (34) 

with the singular values sorted in descending order. Then according to magnitude of the 
singular values, we write Ui = (t/f , Ul), where the singular vectors in Uj^ correspond to high 
singular values above a certain threshold, and the ones in Ul correspond to low singular 
values below the threshold. The basis functions in the element can be separated into two 
accordingly: 

$f = ^,ul\ = ^iU^. (35) 

We now only update the correction term corresponding to the high singular values by solving 



H,,{A<I>)1pI,, - (A<|.)f Af - <^1{AA)^\ BJJl^ 



(36) 
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where 

The approximate solution of the preconditioning equation (132!) is therefore given by 

A<1>, = A$f (t/f)^, AA, = f/f AAf (t/f (37) 

As will be seen in the numerical examples in Section [3l the preconditioned Newton- 
GMRES method is able to obtain the optimized local basis functions efficiently with a small 
number of iterations. 



3. Numerical result 

3.1. Setup 

The accuracy and efficiency of the optimized local basis functions is illustrated using 
a one-dimensional model problem as follows. The number of atoms in the one- dimensional 
model problem is denoted by A^^, the positions of electrons by x, and the positions of ions by 
R = (i?i, i?2, ■ ■ ■ , RnaY ■ The electronic and ionic degrees of freedom are separated by the 
Born-Oppenheimer approximation. The effective Kohn-Sham Hamiltonian of the electrons 
for a given atomic configuration R is 

H{R) = -]^^ + V{x]R). (38) 

The effective electron-ion interaction and electron-electron interaction is modeled by the 
summation of a series of Gaussian functions 

V{x-R) = -^j=Y.e-^. (39) 



1=1 



A and a characterize the height and the width of the potential well around each atom, re- 
spectively. For simplicity, the effective Hamiltonian does not depend on the electron density, 
and hence self-consistency iteration is not involved. The self-consistent iteration will be 
added in the future work. The ion-ion interaction is modeled by a harmonic potential with 
periodized nearest-neighbor interaction 

1 ^^"^ 1 
1=1 

with L being the length of the computational domain. The force on atom / is 

dJ'j.oiR) dVn{R) 



dRi dRi 



(41) 



The finite temperature KSDFT is used here and the Helmholtz free energy for the electrons 
J-dg(-R) is given by (125|) . The finite temperature effect is usually negligible in insulating 
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systems with large band gap, but becomes important for the stabihty in metalhc systems 
with small or vanishing band gap. 

The accuracy is measured in terms of the error of the Helmholtz free energy per atom and 
the error of the force. For a given atomic configuration, the Helmholtz free energy per atom 
and the force are calculated independently using the optimized local basis functions and the 
benchmark plane wave basis functions. Except for the unit of temperature which is Kelvin, 
atomic units are used throughout this section unless otherwise specified. In particular, the 
unit of energy is Hartree, the unit of force is Hartree/Bohr, and the electron mass m, electron 
charge e and the Planck constant h are set to be unity. The detailed choices of the parameters 
in the simulation are as follows. Except in the last example where we test for different system 
sizes, the number of atom is taken to be Na = 8. The average distance between adjacent 
atoms is 10 au, and the size of each element is also set to be 10 au. The initial guess 
of the optimized local basis functions uses the adaptive local basis functions proposed in 
our previous work j9|. The adaptive local basis functions use a small buffer region outside 
each element. The buffer size is 5 au in the present calculation. We compare the electron 
energy and the forces produced by the optimized local basis functions with those obtained 
from a planewave calculation with kinetic energy cutoff at E^ut = 40 Ry, or 20 planewaves 
per atom. The change of the Helmholtz free energy and the force is less than 10~^ au if 
the kinetic energy cutoff for the planewave calculation is further increased. 21 Legendre- 
Gauss-Lobatto (LGL) grid points per element are used to discretize the optimized local 
basis functions as well as the adaptive local basis functions. The change of the Helmholtz 
free energy and the force is less than 10~^ au if the number of LGL integration points is 
further increased. Therefore the numerical integration error is negligible, and the error in 
the calculated Helmholtz free energy and the force faithfully represents the error due to 
the usage of adaptive local basis functions or optimized local basis functions. The electron 
temperature is 2000 K. The penalty parameter a in the DG Hamiltonian is 40. The choice 
of parameters for the potential energy surface is a; = 0.03, A = 5.0, a = 4.0. 

If one electron is assigned to each atom (spin degeneracy is neglected), then the band gap 
at the equidistant configuration is around 14000 K, which is much larger than the electron 
temperature (2000 K). In what follows this system is referred to as the insulating system. 
If four electrons are assigned to each atom, the band gap is is essentially zero (0.5 K). The 
energy levels around the Fermi surface are fractionally occupied due to the thermal effect. 
This system is referred to as the metallic system. 

In the optimization of the local basis functions, the maximum number of Newton itera- 
tions is set to be 4, and the maximum number of iterations for the preconditioned GMRES 
solver for the Newton's equation is set to be 30. We find that the error for solving the linear 
system fl3T]) using 30 preconditioned GMRES iterations is less than 10~^. The threshold 
value for the significant part of the basis functions is set to be 10~^ to avoid degeneracy. The 
preconditioning step is solved by direct LU decomposition method inside each element. 

3.2. Static case 

We first illustrate the performance of the optimized local basis set in the static case. 
20 atomic configurations are generated from equidistant configuration with small random 
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perturbations. The accuracy of using the optimized local basis set is measured by the mean 
absolute value of the error (mean error) of the Helmholtz free energy per atom and the mean 
error of the force of a fixed atom. Besides the optimized local basis functions, the error of 
using the adaptive local basis functions jof is presented as well to illustrate the effectiveness 
of the optimization procedure. 

For the insulating system, the relative error of the force is already 0.5% with as small as 4 
basis functions per atom using the optimized local basis functions (Tabled]). When compared 
to the adaptive local basis functions with the same number of basis functions per atom, the 
error of the Helmholtz free energy per atom is reduced by 51 times, and the error of the 
force is reduced by 14 times after the optimization procedure. It is illuminating to see the 
difference between the adaptive local basis functions and the optimized local basis functions. 
Since any unitary transformation of the basis functions in each element does not change the 
total energy of the system, the basis functions should first be rotated according a certain 
criterion. Here we rotate the basis functions in an element according to the Ritz values of 
the Hamiltonian in the same element. Take the first element for example, the Hamiltonian 
operator is denoted by Hu, and the basis functions in the first element is denoted by $i. 
We solve the following eigenvalue problem 



where Ai is a diagonal matrix with values sorted in ascending order. Then we compare the 
rotated basis functions 



for adaptive and optimized local basis functions in Fig. [H It is found that the optimized 
local basis functions are very close to the adaptive local basis functions, indicating that the 
adaptive local basis functions is already very accurate in computing the total energy of the 
system. The agreement between the adaptive local basis functions and the optimized local 
basis functions is very well for basis functions of low energy (Fig. [D (a)), and the difference 
enlarges for basis functions or higher energy. This can be understood as that the adaptive 
local basis functions include contributions from unoccupied states with relatively high energy 
level, while the optimized local basis functions reduce the contribution from such unoccupied 
states by the optimization procedure. 

Similar results are found for metallic systems (Table [2]). More basis functions are needed 
in this case since there are more electrons in the metallic system than those in the insulating 
system studied here. The relative error of the force is 0.2% with 8 basis functions per 
atom using the optimized local basis functions. When compared to the adaptive local basis 
functions using the same number of basis functions, the error of the Helmholtz free energy 
per atom is reduced by 10 times and the error of the force is reduced by 60 times using 
the optimized local basis functions. The optimized local basis functions therefore greatly 
improve the accuracy with the same number of basis functions. 

On the other hand, the accuracy of using the adaptive local basis functions can be 
systematically improved by increasing the number of basis functions per atom. For example, 
if the number of basis functions per atom is increased from 8 to 12 for the metallic system. 



($f i/n*i)Ci = CiA 



(42) 



(43) 
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Method 


AJ-DG / atom 


Absolute AF 


Relative AF 


Adaptive 


5.7 X 10-^ 


6.8 X 10-5 


4.5 X 10-3 


Optimized 


1.1 X 10^6 


4.9 X 10-6 


3.3 X 10-^ 



Table 1: Mean error of the Helmholtz free energy per atom, the absolute error of the force 
of the first atom, and the relative error of the force of the first atom. This system is an 
insulating system with 1 electron per atom (spin neglected). 4 basis functions per atom are 
used for both the adaptive local basis functions and the optimized local basis functions. 



Method 


AJ^DG / atom 


Absolute AF 


Relative AF 


Adaptive 


1.4 X 10-3 


3.8 X 10-^ 


9.6 X 10-2 


Optimized 


1.7 X 10-^ 


4.5 X 10-*^ 


1.6 X 10-3 



Table 2: Mean error of the Helmholtz free energy per atom, the absolute error of the force of 
the first atom, and the relative error of the force of the first atom. This system is a metallic 
system with 4 electrons per atom (spin neglected). 8 basis functions per atom are used for 
both the adaptive local basis functions and the optimized local basis functions. 

the accuracy of using the adaptive local basis functions is comparable to that of using the 
optimized local basis functions (Table E]). This finding is fully consistent with the previous 
work [qI that the adaptive local basis functions also form an accurate and efficient local 
basis set for the electronic structure calculation. The mild increase of the number of basis 
functions indicates that the adaptive local basis functions are already very efficient at least 
for ID or quasi- ID systems. It is also found in the previous work that the number of adaptive 
local basis functions increases considerably from quasi-lD systems to 3D bulk systems [ij. 
We expect that the number of basis functions can be reduced by a significant amount using 
optimized local basis functions in 3D bulk systems. 



Method 


AJ-'dg / atom 


Absolute AF 


Relative AF 


Adaptive 


3.4 X 10-5 


1.9 X 10-^ 


1.1 X 10-^ 


Optimized 


3.4 X 10-5 


1.7 X 10-^ 


1.0 X 10-^ 



Table 3: Mean error of the Helmholtz free energy per atom, the absolute error of the force of 
the first atom, and the relative error of the force of the first atom. This system is a metallic 
system with 4 electrons per atom (spin neglected). 12 basis functions per atom are used for 
both the adaptive local basis functions and the optimized local basis functions. 

We also test the optimized local basis functions on a system with local defects. The 
defect system is obtained by choosing the parameter a at one atom in the potential (!39|) to 
be different from the parameters a of the rest of the atoms. The system contains 8 atoms 
with 4 electrons and 8 basis functions per atom. The parameter a is set to be 5.0 for all 
atoms except for the first atom which is set to be 3.0. The error of the Helmholtz free energy 
per atom and the error in the force of the defect atom are comparable to those in the periodic 
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case (Table H]). 



Method 


AJ^DG / atom 


Absolute AF 


Relative AF 


Adaptive 


1.3 X 10-3 


1.1 X 10-^ 


6.3 X 10-2 


Optimized 


1.8 X 10-^ 


5.3 X 10-*^ 


3.3 X 10-3 



Table 4: Mean error of the Helmholtz free energy per atom, the absolute error of the force of 
the first atom, and the relative error of the force of the first atom for a metallic system with 
a defect. 8 basis functions per atom are used for both the adaptive local basis functions and 
the optimized local basis functions. 

Finally, we compare the performance of the adaptive local basis functions and the op- 
timized local basis functions for systems of increasing size with 8, 16, 32, 128, 256 atoms, 
respectively. The system is randomly perturbed by 0.2 au from the crystalline configuration, 
with a defect introduced at one atom of the potential. The computational time for con- 
structing the adaptive local basis functions (red dashed line with star) and for constructing 
the optimized local basis functions (blue solid line with triangle) are compared in Fig. [2] (a) 
plotted in logarithmic scale. 5 Newton steps and 30 GMRES iterations are used for the 
outer iteration and the inner iteration respectively in the optimization procedure. Since the 
optimized local basis functions use the adaptive local basis functions as an initial guess, the 
computational time for the optimized local basis functions also includes that for the adaptive 
local basis functions. The computational time for constructing both the adaptive local basis 
functions and the optimized local basis functions are linear thanks to the locality of the 
basis functions. The construction of the optimized local basis functions is 6 ~ 9 times more 
expensive than the construction of the adaptive local basis functions, indicating that the op- 
timization procedure should be further improved in order to generate a practically efficient 
optimized local basis set. The error of the Helmholtz free energy per atom and the error of 
the force on the first atom are shown in Fig. [2] (b) and (c), respectively. It is found that 
the Helmholtz free energy obtained by the optimized local basis functions is stably 8 ~ 9 
times more accurate than that obtained by the adaptive local basis functions. The ratio 
of improvement of the force has a much larger dependence on the realization of the atomic 
configuration which ranges from 5 ~ 170 times, with the average ratio of improvement being 
around one order of magnitude. 

3.3. Dynamic case 

The optimized local basis set is able to accurately compute the electron energy and the 
force using a small number of basis functions. Now we show that the optimized local basis 
functions can also be used in molecular dynamics. We illustrate the performance of the 
optimized local basis functions for molecular dynamics using the same metallic system as in 
Section 13.21 with 4 electrons and 8 basis functions per atom. 

In the Born-Oppenheimer approximation, the equations of motion for atom I are given 

by 

MiRi = Fj, J=l,---,iV^, (44) 
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The mass of the ions M/ is set to be 42000 which is close the mass of sodium in the atomic 
unit. Fi is the Hellman-Feynman force in (HTj) for atom J. The equations of motion fl44l) 
conserve the total energy given by 

Eic = + + ViAR). (45) 

The numerical conservation of the total energy is quantified by the drift of -E/c? which is 
defined as the relative difference of Eic along the trajectory, i.e. 



Velocity-Verlet scheme [20| is used to propagate the equations of motion for the atoms 
with the time step At = 1.21 femtoseconds (fs). The simulation length is 10000 steps and the 
total length of the simulation is 12.1 picoseconds (ps). To ensure the time-reversibility of the 
numerical scheme, the optimized local basis functions use the adaptive local basis functions 
as the initial guess at every time step. However, this is not a necessary requirement and 
can be improved by other time-reversible schemes such as the extended Lagrangian Born- 



Oppenheimer method [21|. The initial configurations of the atoms are perturbed by 0.2 au 
away from the equilibrium equidistant configuration, and the initial kinetic energy of the 
atoms is 1000 K with the mean velocity of all atoms (i.e. the velocity of the centroid) being 
zero. The error of the force and the error of the Helmholtz free energy per atom are well 
within 2.5 x 10~^ and 1.4 x 10~^, respectively (see Fig. |3] (a) and (b)), which is consistent 
with the behavior of errors in the static calculation. The Helmholtz free energy obtained 
from the optimized local basis functions is systematically higher than that in the benchmark 
planewave simulation. The sources of the systematic shift are the penalty parameter a in the 
DG formulation, and that the minimization procedure is restricted to an admissible set of 
the space spanned by the primitive functions. Nonetheless, the mean deviation of the force 
is unbiased, indicating that the structure of the trajectory obtained using the optimized 
local basis functions is well preserved. The drift of the conserved quantity ( 146|) is also well 
controlled within 5 x 10~^ (Fig. |3] (c)). 



4. Conclusion 

We have developed the optimized local basis set to solve models in the Kohn-Sham 
density functional theory for both insulating and metallic systems. The optimized local 
basis functions form an accurate basis set for computing the electron energy as well as 
the atomic force with a small number of basis functions per atom. When the optimality 
condition is achieved, the optimized local basis functions give the lowest energy among all 
the basis functions in an admissible set determined by the primitive basis functions. The 
force is accurately described by the Hellmann-Feynman force, and the contribution of the 
derivative of the basis functions [i.e. the Pulay force) vanishes automatically. The concept of 
the optimized local basis functions is quite general, and the methods developed in this paper 
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are useful for other problems such as selecting basis functions and evaluating parameter- 
dependent functions as well . 

To obtain the optimized local basis functions in practice, we proposed a preconditioned 
Newton-GMRES method. The resulting optimized local basis functions are tested using a 
one-dimensional model problem. We find that the optimized local basis functions accurately 
compute the Helmholtz free energy and the force using a very small number of basis functions 
per atom for both insulating and metallic systems. When applied to the molecular dynamics 
simulation, the optimized local basis functions do not exhibit any systematic drift in terms of 
the force or the total energy for the ionic degrees of freedom. Therefore the optimized local 
basis functions are able to give the correct statistical and dynamical properties along the 
molecular dynamics trajectory, and can be used for long time molecular dynamics simulation. 

The optimized local basis set provides an implementable criterion to eliminate the artifi- 
cial effect in the force due to the change of the basis functions and to maintain a small set of 
basis functions, which makes the optimized local basis set an ideal tool in the molecular dy- 
namics simulation. However, the construction of the optimized local basis functions is found 
to be already more expensive than other choices such as adaptive local basis functions, indi- 
cating that the optimization procedure should be further improved especially when applied 
to Kohn-Sham density functional theory in 3D. The more efficient scheme may be achieved 
by including a feasible approximation of the derivative of the density matrix with respect to 
the basis function, a more efficient preconditioner for the GMRES iteration, or even a more 
efficient gradient method instead of a Newton-type method. These will be our future work. 
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Appendix A. Finite temperature Kohn-Sham density functional theory 

In this appendix, we briefly described the basic formulation of the Kohn-Sham density 
functional theory [l], 0] and its finite temperature generalization. In the Kohn-Sham density 
functional theory, the ground state electron energy is written as 

Etot = ^tot({^i}) = 2^2 1^^^ I' + / KxtP dx + ^ 7^ ^ / b*e^Pi dx 

i=l e i=l 

\jj ##d^d» + j £„|pWldi, (A.l) 



+ 2 

where the Kohn-Sham orbitals are the solutions to the minimization problem 



min £'tot({^i}), 

r (A.2) 
s.t. ip*ipjdx = 5ij, z,j = l,---,N. 
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With slight abuse of the notation, we denote by {ipi} both the arguments in the minimization 
problem (lA.2p . and the solutions to the minimization problem, i.e. the Kohn-Sham orbitals. 



The electron density is p{x) = 1-1/^4(0:) We have neglected the spin degeneracy. The 

first term of flA.ll) is the kinetic energy. The second and third terms come from pseudo- 
potential, which we have taken the Kleinman-Bylander form j22|]. The pseudopotential is 
given by 

^ps = Kxt + '^'je\be){be\. 

I 

For each £, hn is a function supported locally in the real space around the position of one 
of the atoms, 7^ = +1 or —1, and we have used the Dirac bra-ket notation. The fourth 
term is the Coulomb interaction between electrons, and the fifth term is the exchange- 
correlation functional, for which the local density approximation (LDA) [2^ 



24| is adopted. 



The proposed method can also be used for more complicated exchange-correlation functionals 
such as the generalized gradient approximation (GGA) functionals (25| . 

The ground state electron energy defined in (lA.ip is applicable to insulating systems 
with large band gap, but is difficult to evaluate for zero-gap metallic systems. For metallic 
system, finite temperature KSDFT becomes the standard tool 26|], in which the Helmholtz 
free energy is considered instead. For given finite temperature T > 0, the Helmholtz free 
energy is given by 



^tot = ^tot({v^i},{/i}) = ^^/i y iv^ii^dx+ y Kxtpdx 

p[x)p{y) 



bl'ipi dx 



F - y\ 



dx dy 



J [p(x)] dx + In /. + (l-/.)ln(l -/,)). (A.3) 



Correspondingly {tpi} and {/j} are the solutions to the minimization problem 

,min J'tot(M},{/i}), 

{■>Pi},{fi} 

s.t. ij*ijjdx = 6ij, i,j = l,---,N. 



(A.4) 



Here /3 is the inverse temperature /3 = l/Zc^T. The number of eigenstates N is chosen 
to be slightly larger than the number of electrons in order to compensate for the finite 
temperature effect, following the criterion that the occupation number is sufficiently small 
(less than 10"^). {/j} G [0, 1] are the occupation numbers which add up to the total number 

of electrons A^ = J2iLi fi^ ^^d the electron density p = fil'^if- Compared to flA.ip . 

the only extra term is the last term, which characterizes the entropic contribution. 

The Kohn-Sham equation, or the Euler-Lagrange equation associated with ( ]A.4p reads 



Ves[p] + ^'ye\be){bi\)i/ji = 



(A.5) 
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where the effective one-body potential Vcs is given by 



Kfr[p](x) = Kxt(x) + / dy + e'Jpix)]. (A.6) 



F - y\ 



The occupation numbers are given by 



l + exp(/3(A.-/i))' ^^-^^ 

which is the Fermi-Dirac distribution evaluated at A^. Here fi is the chemical potential, which 
is chosen so that fi satisfies 

5^ /, = AT. (A.8) 

i 

Note that flXSjl is a nonlinear eigenvalue problem, as Ves depends on p, which is in turn 
determined by {ipi}. The electron density is self- consistent if both flA.51) and flA.6P are 
satisfied. After obtaining the self-consistent electron density, the Helmholtz free energy can 
be expressed as 

^tot = ^tot (P, /i) = J] f^K + J2 1^ + (1 - - f^)) 

i i 

^-^0^dxdy + 1 6.e[p(x)]dx- I e'JpixMx)dx. (A.9) 

The goal of finite temperature Kohn-Sham density functional theory is to calculate the free 
energy J-tot, the self-consistent electron density p and also the chemical potential p given 
the number of electrons, the temperature and the atomic configuration. The Helmholtz 
free energy J^tot(-R) plays the role of the electron energy E{R) in Section [1], and the force 
is defined as the negative gradient of the Helmholtz free energy F{R) = The 
Helmholtz free energy is applicable to both the insulating and the metallic systems. As 
T — )■ 0, the Helmholtz free energy J-'tot reduces to the ground state electron energy -Etot- 
Therefore flA.ip is also called the zero temperature KSDFT. 
As fi is given by the Fermi-Dirac distribution, we have 

E/^^* = TrT- ^77 (A.IO) 

^ 1 + exp{f3{H - p)) 

^ = l + exp(/3(i7-/i)) l + exp(/3(i7 -/.))' ^^'^^^ 

y (1 - /.) ln(l - /.) = Tr ^Mm -f^)) 1^ eMP{H -p)) 

Y 1 + exp{(3{H - p)) 1 + exp{^{H - p)) ^ ^ 

Using these, we can rewrite (I A. 9 1) as (see e.g. (27l |) 

J-tot(p, p) = Tr ln(l + exp(/3(/i - H[p]))) + pN 

1 f f p{x)p{y) 



2 



// + / exc[p(x)]dx-y"6;jp(x)]p(x)dx. (A.13) 
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One can verify by straightforward calculations that 

5-7^tot(p,yu) 



(A.14) 



Sp 

if p and p are the self-consistent solution of the Kohn-Sham equation (IA.5|) . Taking derivative 



of flA.lSp with respect to p, we have 



5-^tot(p,/i) _ Tr ^^P(/^(/^--^)) +N = o, (A.15) 



dp 1 + exp(/3(/i - H)) 

Therefore, the atomic force takes the form 

d J'tot (p, P, R) dTtot (p, At, i?) 



Tr 



dR OR 
1 

1 + exp{(3{H - p)) OR ' 



(A.16) 



This is known as the Hellman-Feynman theorem at finite temperature. 

The Kohn-Sham density functional theory is usually solved by using the self-consistent it- 
eration, where at each iteration, the electron density p is obtained from effective Hamiltonian 
Hcs- Given an effective potential Vcs, and hence the effective Hamiltonian 

= -|A + Kff + 5^7^|6£)(6£|, (A.17) 

e 

we find p from p(x) = /j where {ipi}^s are eigenfunctions of i^es, and the definition 

of {fi} follows (]A.7P and (lA.Sp . Note that the {ij^i} and {/i}'s minimize the variational 
problem 

J'c^^{m,{f^}) = lYl J fi\'^U^)fdx + J V,s{x)p{x)dx 

+ E E I I' + f^~' E (/^ 1^ + (1 - 1^(1 - /^)) ' (A.18) 

with the orthonormality constraints = 6ij. 
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(c) h (d) 04 



Figure 1: Comparison of the adaptive and optimized local basis functions for an insulating 
system with 1 electron per atom (spin neglected). The adaptive local basis functions (red 
dashed line) and optimized local basis functions (blue solid line) are sorted according to the 
Ritz value of the local Hamiltonian in ascending order. 
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8 16 32 64 128 256 8 16 32 64 128 256 8 16 32 64 128 256 

#Atoms T^Atonis #Atoms 

(a) Computational time (b) Error of the Helmholtz free en- (c) Error of the force 

ergy per atom 



Figure 2: (a) The computational time for solving systems of various sizes using adaptive 
local basis functions (red dashed line) and optimized local basis functions (blue solid line), 
(b) The error of the Helmholtz free energy per atom using adaptive local basis functions 
(red dashed line) and optimized local basis functions (blue solid line), (c) The absolute error 
of the force for the first atom using adaptive local basis functions (red dashed line) and 
optimized local basis functions (blue solid line). 




Figure 3: The error of the force of the first atom (a), the error of the Helmholtz free energy per 
atom (b) and the drift of the conserved quantity (c) along the trajectory of the MD simulation 
plotted every 0.12 ps. The system is metallic with 4 electrons and 8 basis functions per atom. 
The mean deviation of the force is unbiased. 
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